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Abstract 

We develop a simple theoretical framework for the evolution of weighted 
networks that is consistent with a number of stylized features of real- world 
data. In our framework, the Barabasi-Albert model of network evolution 
is extended by assuming that link weights evolve according to a geometric 
Brownian motion. Our model is verified by means of simulations and real 
world trade data. We show that the model correctly predicts the intensity 
and growth distribution of links, the size- variance relationships of the growth 
of link weights, the relationship between the degree and strength of nodes, 
as well as the scale-free structure of the network. 
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1 Introduction 



Graph theory has been used to describe a vast array of real-world phenomena, but 
only recently the attention has shifted from binary to weighted graphs, from both 
an empirical [U El El lU El E] and a theoretical perspective [H El E] • The empirical 
literature has come up with a number of robust stylized facts that apply to a 
wide range of phenomena as different as Internet traffic, airport connections, and 
international trade. In particular, it has been demonstrated that weighted graphs 
display (i) a power law connectivity distribution P(K), with finite size truncation 
[TU1I5]; (ii) a skewed distribution of link weights P(w) and node strengths measured 
as the sum of the weights of the links of a given node P(W) [HI [12]; (iii) a power- 
law relation between node strength W and node degree K: W = K e , with 9 
ranging between 1.3 and 1.5 pTJl [T3] . 

In this paper we present a simple stochastic model of proportionate growth of 
both the number and the weight of links to describe the structure and evolution of 
weighted networks and account for above mentioned regularities. In our setup we 
extend the Barabasi and Albert (BA) model [H] to accommodate weighted network 
dynamics. This is done exploiting the theoretical framework recently put forward 
by Stanley and co-authors to explain the scaling distribution of fluctuations in 
complex systems [T^l ESI EH] • 

We test our model using data on the network of international trade flows, 
which is a prototypical example of a real-world network that is inherently weighted. 
International trade flows have been traditionally analyzed in the contest of the so- 
called gravity model [18] that relates bilateral flows to countries' size and distance. 
However, one of the main limits of this approach is its inability to capture the large 
fraction of zeros existing in the matrix of bilateral links. Although this has recently 
been addressed in the context of standard economic theory [15] . graph theory as 
been applied to naturally accommodate this feature of the data. 

We selected the international trade network (ITN) as a testbed for our model 
based on the following considerations. First, the ITN has been already extensively 
investigated El HH ED El E21 El], and previous works on the ITN provide 
us with a rich set of empirical regularities. Thus, we know that the link weight 
distribution assumes a log-normal form in the case of the ITN [231 El], whereas 
their growth rates display fat tails [21]. Second, the relationship between node 
strength and degree is crucial in the economic literature about ITN since it is 
related to the interplay between intensive and extensive margins of trade, which 
is key to explain trade flows [22] Third, despite the structural inertia of the 
ITN, the huge volatility of trade flows after the 2008 global financial crisis has 

lr The extensive margin consists of the number of trading partners and the number of products 
exported K, whereas the intensive margin represents the amount shipped per product per country 
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recently attracted a great deal of attention. Our theoretical framework provides 
an explanation for the relationship between node centrality and the variance of 
network flows. 

The paper is organized as follows. Section [2] presents the model and its most 
important predictions. We then test our model using data on the ITN (Section 
[3]) and simulations (Section H]). Finally, in the last Section we lie down some 
conclusions and outline possible patterns for future research. 

2 The model 

Barabasi and Albert [H] have proposed a simple stochastic model of network 
growth based on preferential attachment which accounts for many of the stylized 
facts observed in real-world networks. The increasing interest in the study of 
weighted versions of networks calls for an extension of the original BA model to 
account for the large degree of heterogeneity across link weights [9] . The route 
we take here exploits the theoretical framework recently put forward by Stanley 
and co-authors [16J to deal with the growth dynamics of complex systems. We 
prove that our model is capable to accurately match the structural properties that 
characterize a number of real-world weighted networks. 

We therefore propose a generalized version of the BA model to describe the 
dynamic and growth of weighted networks, by modeling them as a set of links of 
different weight occurring among nodes. In particular, we assume that the weight 
of links grows according to a geometric Brownian motion (also known as Gibrat's 
law of proportionate effects |26j), so that the expected value of the growth rate of 
link weights is independent of their current level. 

The key sets of assumptions in the model are the following [Til EZl HE] : 

1. The network begins at time t — with iVo nodes each with a self loop. At 
each time step t = {1, . . . , M}, a new link among two nodes arises: thus the 
number of links (excluding self-loops which are used only for initialization) 
existing at time t is mt — t. We write Ki(t) for the number of links of node i 
at time t (node degree). To identify the nodes connected by the newly formed 
link at time t we adopt the following procedure: with probability a the new 
link is assigned to a new source node, whereas with probability 1 — a it is 
allocated to an existing node i. In the latter case, the probability of choosing 
node % is given by: piit) = Ki(t—l)/2t. Edge endpoints i and j of the new link 
are chosen symmetrically with % ^ j. Thus with probability a the new link 
is assigned to a new target node, while with probability 1 — a it is allocated 
to an existing node with probability Pj(t) = Kj(t — 1) / (2t — Ki(t — 1)) if 
j 7^ i and Pj(t) = otherwise. Hence, at each time t this rule identifies the 
pair of (distinct) nodes to be linked; 
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2. at time t each existing link between nodes % and j has weight Wij(t) > 0, 
where Ki, Kj and Wij are independent random variables. At time t + 1 the 
weight of each link is increased or decreased by a random factor Xij(t), so 
that Wjj(t + 1) = Wij(t)xij(t). The shocks and initial link weights are taken 
from a distribution with finite mean and standard deviation. 

Thus we assume that each link weight grows in time according to a random pro- 
cess. Moreover the two processes governing link formation and weight growth 
are assumed to be independent. We therefore combine a preferential attachment 
mechanism (Assumption 1), with an independent geometric Brownian motion of 
link weights (Assumption 2). In this way we obtain a generalization of the BA 
setup capable to account for the growth of weighted networks. 

Based on the first assumption we derive the degree distribution P(K) [HI EE]. 
In the absence of the entry of new nodes (a = 0) the probability distribution of 
the number of links at large t, i.e. the distribution P{K), is exponential: 

P(K)*±eM-K/K), (1) 

where K = 2t/N is the average number of links per node, which linearly grows 
with timeJl 

If a > 0, P{K) becomes a Yule distribution which behaves as a power law for 
small K: 

P{K) ~ K~ v , (2) 

where (f = 2 + a/(l — a) > 2, followed by the exponential decay of Eq.([T]) for large 
K with K = (1 + 2t/N ) 1 - a - 1 [T5]. 

Hence, in the limit of large t when a = (no entry), the distribution of P(K) 
converges to an exponential; on the contrary when a > and small the connectivity 
distribution at large t converges to a power-law with an exponential cutoff [15J. 

By using the second assumption we can compute the growth rate of the strength 
of nodes. The strength of node i is given by Wi = Yl w ij- The growth rate 

Ki 

is measured as g — \n(W(t + 1)/W(t)). Thus, the resulting distribution of the 
growth rates of node strength P(g) is determined by 

oo 

P(g)^J2 P ( K ) P (9\n (3) 

K=l 

where P(K) is the connectivity distribution, computed in the previous stage of the 
model and P(g\K) is the conditional distribution of growth rates of nodes with 
given number of links determined by the distribution P(w) and P(x). 

2 K does not include initial self loops. 
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Fu and colleagues [16] find an analytical solution for the distribution of the 
growth rates of the weights of links P(g) for the case when a — > and t — > oo, 

2V 

P{g) « 3 . (4) 

-P(g) has similar behavior to the Laplace distribution for small g i.e. P(g) ~ 
exp(— V2\g\/ \/Vg)/ y/^Vg, while for large g, P(<?) has power law tails P(g) ~ g~ 3 
which are eventually truncated for g —>■ oo by the distribution P(x) of the growth 
rate of a single link. 

A further implication of the model that can be derived from the second as- 
sumption concerns the distribution of link weights P(w). The proportional growth 
process (Assumption 2) implies that the distribution of the weights P(w) converges 
to a log-normal. Thus node strength W is given by the sum of K log-normally 
distributed stochastic values. Since the log-normal distribution is not stable upon 
aggregation, the distribution of node strength P(W) is multiplied by a stretching 
factor that, depending on the distribution of the number of links P{K) could lead 
to a Pareto upper tail [29J. 

Moreover, a negative relationship exits among the weight of links and the vari- 
ance of their growth rate. Our model implies an approximate power-law behavior 
for the variance of growth rates of the form a(g) = PF _/3(W ) where /3(W) is an 
exponent that weakly depends on the strength W. In particular, (3 = for small 
values of W, j3 = 1/2 for W — > oo, and it is well approximated by (3 ~ 0.2 for a 
wide range of intermediate values of W [17] . 

Finally, the model yields a prediction also on the relation between the degree K 
and the strength W of each node. In SectionH]we show that since the weight of each 
link is sampled from a log- normal distribution (w are log- normally distributed), 
and given the skewness of such a density function, the law of large numbers does 
not work effectively. In other words, the probability to draw a large value for a 
link weight increases with the number of draws, thus generating a positive power 
law relationship between W and K, for small K. 



3 The Empirical Evidence 

To test our model we use the NBER-United Nations Trade Data [31] available 
through the Center for International Data at UC Davis. This database provides 
bilateral trade flows among countries over 1962-2000, disaggregated at the level of 
commodity groups (4-digit level of the Standard International Trade Classification, 
SITC). Data are in thousands US dollars and, for product-level flows, there is a 
lower threshold at 100,000 dollars below which transactions are not recorded. One 
point to note is that disaggregated data are not always consistent with country 
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trade flows: in a number of cases we do not observe any 4-digit transaction recorded 
between two countries, but nevertheless find a positive total trade, and vice-versa. 
Since we take the number of product traded among any two country pairs as the 
empirical counterpart of the number of transactions, to avoid inconsistency we 
compute the total trade by aggregating commodity-level data. 

In this Section we test the predictions of our model while in the following 
Section, we use the data to calibrate the simulations and check for the ability of 
the model to replicate real-world phenomena by comparing simulated and actual 
trade flows. We already know from previous work [21] that the main features of 
the ITN are broadly consistent with our model. Here we look in more details at 
some specific characteristics of ITN. 




Figure 1: Distribution of the number of products traded - 1997. Double logarith- 
mic scale (main plot) and semi logarithmic scale (inset) 



Figure [T] shows that the distribution P(K), that is the number of 4-digit SITC 
products traded by countries, is power-law distributed with an exponential cutoff. 
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The main plot displays the probability distribution in log-log scale, whereby the 
power-law is the straight line body, and the exponential cutoff is represented by 
the right tail. The inset presents the same distribution in semi-log scale: this time 
it is the exponential part of the distribution that becomes a straight line, so that 
with this trick we can magnify what happens to the probability distribution as 
K grows large. As discussed in Section [2] above, the power law distribution of K 
hints at the existence of moderate entry of new nodes in the network. Indeed, 17 
new countries enter into the ITN during the observed time frame, mostly due to 
the collapse of the Soviet Union and Yugoslavia. 




Figure 2: The distribution of the link weights and node strength in year 1997. 
Complementary cumulative distribution of the strength distribution P(W) (ag- 
gregate flows) and link weights P(w) (commodity flows) and their power-law fits 
(dashed lines) 

Moving to the weighted version of the network, one can look at the distribution 
of positive link weights as measured by bilateral trade flows at the commodity level, 
P(w), as well as the total value of country trade or node strength P(W). Figure 
[2] shows the complementary cumulative probability distribution of trade flows in 
log-log scale, both for product-level transactions and for aggregate flows. Figure 
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prefers to 1997 data (other years display the same behavior). We observe that 
both distributions show the parabolic shape typical of the log-normal distribution, 
thus conforming to previous findings [231 EI]. As predicted, upon aggregation the 
power-law behavior of the upper tail become more pronounced [29] . However, this 
departure from log- normality concerns a very small number of observations (0.16% 
in the case of commodities flows, 2.21% for aggregate flows) since only a few new 
nodes (countries) enter the network over time. 

As for the growth of trade flows, Figure [3] shows the empirical distribution 
P(g) together with the maximum likelihood fit of Eq.(j3]) as well as the Generalized 
Exponential Distribution (GED, with shape parameter 0.7224). 



%r empirical 




log(g) 



Figure 3: Distribution of the growth rates of aggregate trade flows P(g) 

Goodness of fit tests, reported in Table [TJ show that P(g) is neither Gaussian 
nor Laplace, whereas the distribution in Equation (j3J) performs much better in 
terms of Kolmogorov-Smirnov (KS) and Anderson-Darling (AD) testsJl Hence, 

3 The KS and the AD are non-parametric tests used to evaluate whether a sample comes from 
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Table 1: Kolmogorov-Smirnov (KS) and Anderson-Darling (AD) goodness of fit 
tests for the distribution of growth rates of trade flows P(g) 



Distribution 


Mean 


Variance 


KS 


AD 


Gauss 


0.0333 


0.8417 


10.8305 


117329 


Laplace 


0.0040 


0.5338 


2.8414 


1.4107 


GED (shape parameter 0.72) 


0.0444 


0.2899 


1.0915 


0.0314 


Equation flU) 


0.0651 


0.3658 


0.8214 


0.0477 



the growth of nodes centrality, as measured by strength W, follows the same law 
of the fluctuations of the size of complex systems [161 35J . This is not surprising, 
since the size of an airport can be measured by the number of the passengers that 
travel through it and the size of a firm in terms of sales is given by the sum of 
the value of each product it sells. Thus the theoretical framework of Stanley and 
colleagues [TB] complements and completes the BA proportional growth model in 
the case of weighted networks. 

As discussed in Section [2J our model implies a negative relationship between 
node strength and the variance of its growth rate. Figure H] reports the standard 
deviation of the annual growth rates of node strength (g) and their initial magni- 
tude (W). The standard deviation of the growth rate of link weights exhibits a 
power law relationship a(g) = W' 13 with /3 ~ .2, as predicted by the model [T7] . 
This implies that the fluctuations of the most intense trade relationships are more 
volatile than expected based on the central limit theorem. 

All in all, our model accurately predicts the growth and weight distribution of 
trade flows, the number of commodities traded and the size- variance relationship of 
trade flows. Thus we can conclude that a stochastic model that assumes a propor- 
tional growth of the number of links combined with an independent proportional 
growth process of link weights can reproduce most of the observed structural fea- 
tures of the world trade web and should be taken as a valid stochastic benchmark 
to test the explanatory power of alternative theories of the evolution of interna- 
tional trade and weighted networks in general. In the next section we will compare 
the structure of random networks generated according to our model and with the 
real world trade network. 

a population with a specific distribution. Both KS and AD tests quantify a distance between 
the empirical distribution function of the sample and the cumulative distribution function of 
the reference distribution. The AD test gives more weight to the tails than the KS test. More 
detailed information are available in [33] . 
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Figure 4: Size-variance relationship between nodes strength W (trade values) and 
the standard deviation of its growth rate cr(g), double logarithmic scale 



4 Simulation Results 

Based on the assumptions in Section [2] we generate a set of random networks 
and fit them with real world data in order to test the predictive capability of 
our theoretical framework. We will proceed in two steps. First, we generate the 
unweighted network according to the first set of assumptions. Next, we assign 
the value of weights based on a random sampling of K values from a log-normal 
distribution P(w) whose parameters are obtained through a maximum likelihood 
fit of the real world data. 

We model a system where at every time t a new link is added, which represents 
the possibility to exchange one product with a trading partner. We slightly modify 
the original setting in order to account for the possibility that the new links could 
be assigned randomly rather than proportionally to node connectivity. Thus in 
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our simulations the parameter a governs the entry of new nodes according to 
Assumption 1, whereas parameter b is the probability that a new link is assigned 
randomly. Thus with probability a the new link is assigned to a new source node, 
whereas with probability I— a it is allocated to an existing node i. In the latter case, 
the probability of choosing node i is now given by Pi(£) = (l—b)K i (t—l)/2t+b/N t -i 
where N t _i is the number of nodes at time t — 1. The target of the new link is 
chosen symmetrically with i ^ j. 

By tuning the two model parameters a and b we generate different networks 
in terms of the connectivity distribution of trade links P(K). In particular, with- 
out entry (a = 0) and completely random allocation of opportunities (b = 1) one 
obtains a random graph characterized by a Poisson connectivity distribution 
whereas allowing entry (a > 0) P{K) is exponentially distributed. Keeping a posi- 
tive entry rate, but assigning opportunities according to a preferential attachment 
model (b = 0) the model leads to a power-law connectivity distribution with an 
exponential cut-off which is more pronounced the higher is the number of initial 
nodes N . In the limit case in which entry of new nodes is ruled out (a = 0) then 
the connectivity distribution tends toward a Bose-Einstein geometric distribution. 

We compare the structure of random scale-free model networks with the real 
world trade network in 1997. Since the structure of the network is highly stable 
over time results do not change substantially if we compare simulations with the 
structure of the real world network in different years. In the first stage, we generate 
one million networks with a and b both ranging from to 1. We simulate random 
networks of 166 nodes (countries) and 1,079,398 links (number of different com- 
modities traded by two countries). The number of commodities traded is taken 
as a proxy of the number of transactions. Next we select the random networks 
that better fit the real world pattern in terms of correlation, as measured by the 
Mantel r test, and connectivity distribution]^] 

Figure[5]reports the value of the Mantel test for networks with < b < 1 and an 
entry rate a which implies the entry form to 66 countries. The Mantel correlation 
statistics reach a peak of .88 (p < .01) in the case of preferential attachment regimes 
(b = 0). However, the Mantel test does not discriminate among different entry 
regimes. We next compare the connectivity distribution of simulated networks 
with the real world distribution of the number of traded commodities P{K) by 
means of the Kolmogorov-Smirnov (KS) goodness of fit test. Figure [6] confirms 
that the best fit is obtained in the case of a purely preferential attachment networks 

4 The Mantel test is a non-parametric statistical test of the correlation between two matrices 
[37j . The test is based on the distance or dissimilarity matrices which, in the present case, 
summarize the number of links between two nodes in the simulated and real networks. A typical 
use of the test entails comparing an observed connectivity matrix with one posed by a model. 
The significance of a correlation is evaluated via permutations, whereby the rows and columns 
of the matrices are randomly rearranged. 
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number of new countries 
b (prob. of random assignment) 



Figure 5: Mantel test comparing simulated and real networks 



(6 = 0). However the KS tests provides additional information on the most likely 
value of a (entry rate of new nodes). 

Figure [7] shows that the our model can better reproduce the connectivity dis- 
tribution with and entry rate a > that implies the entry of 14-18 countries. This 
closely corresponds to the empirically observed number of new countries. Thus 
we can conclude that a simple proportional growth model with mild entry can 
account for the distribution of the number of commodities traded by each pair of 
countries. 

By introducing the value of the transactions we can show that the model gen- 
erates the observed relationship between intensive and extensive margins of trade. 
Figure [8] depicts the relationship between total trade flows (W) and the number of 
trade links maintained by each country (K) . Empirically, we proxy the number of 
transactions by means of the number of products traded by each country. Figure 
E] displays the relationship that emerges from 1997 trade data, and confirms that 
there exists a positive correlation between the two variables. The slope of the 
interpolating line (1.33) in double logarithmic scale reveals a positive relationship 
between the number of commodities and their average value of the kind W = K e 
with 9 « 1.33. 

The curve displays and upward departure in the upper tail. This can be ex- 
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number of new countries 



b (probability of random assignment) 



Figure 6: Kolmogorov-Smirnov goodness-of-fit test for different entry rates and 
probabilities of random assignment 

plained by noticing that the product classification that we used imposes a ceiling 
to the number of products a country can trade since there are only around 1,300 
4-digit categories (vertical dotted line)H 

Apart from the upper decile of the distribution, the simulated version of the 
network shows exactly the same dependence among the magnitude and the number 
of the transactions. This seems surprising by considering that the model assumes 
two independent growth processes for the number of transactions K and their 
values w. However, it should be noticed that the law of large numbers does not 
work properly in case of skew distributions such as the log-normal. Given a random 
number of transactions with a finite expected value, if its values are repeatedly 
sampled from a log-normal, as the number of links increases, the average link 
weight will tend to approach and stay close to the expected value (the average 

5 Another possible explanation is that for large enough K, some scale effects kick in establishing 
a correlation between K and W. This could be tested in other real networks with larger K and 
no cutoff. We are aware that our modeling strategy to assume away any relationship between the 
mechanisms governing the binary structure of the network and the one assigning link weights is 
as extreme as other strategies that simply assume a single process governing the two parts. Yet, 
we consider the positive relationship between K and W as an interesting emerging property of 
the model. 



13 



45 




10 1 1 1 1 1 1 1 

10 20 30 40 50 60 70 

Number of new countires (1992-2000) 

Figure 7: Kolmogorov-Smirnov goodness-of-fit test for different entry rates in a 
pure preferential attachment regime (b = 0) 

for the population). However this is true only for large K, while according to the 
distribution P{K) the vast majority of nodes has few links (small K). The higher 
is the variance of the growth process of link weights, the larger has to be K to 
start observing convergence toward W = wK with 9 = 1 predicted by the law of 
large numbers. Thus only the largest countries approach the critical threshold. In 
sum, our simulations demonstrate that our model can account for the relationship 
between K and W that has been observed in many real world weighted networks 

mansj. 

5 Discussion and Conclusions 

Using a simple model of proportionate growth and preferential attachment we are 
able to replicate some of the main topological properties of real-world weighted 
networks. In particular, we provide an explanation to the power-law distribution 
of connectivity, as well as for the fat tails displayed by the distribution of the 
growth rates of link weights and node strength. Additionally, the model matches 
the log-normal distribution of positive link weights (trade flows in the present con- 
text) and the negative relationship between node strength and variance of growth 
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Figure 8: The relationship between the number of products traded and trade value. 
Double logarithmic scale. Simulated (back) and real-world (red) data, mean and 
one standard deviation in each direction. The dashed line represents the reference 
line W = K e with 9 « 1.33 

fluctuations a(g) = W~P with (3 « .2. 

The main contribution of the paper is to offer an extension of the BA model 
for weighted networks. Besides, we provide further evidence that such a unifying 
stochastic framework is able to capture the dynamics of a vast array of phenomena 
concerning complex system dynamics [16J. 

Further refinements of our model entail investigating its ability to match other 
topological properties of the networks such as assortativity and clustering. 
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